annotations_edb2$ENTREZID <- as.character(annotations_edb2$ENTREZID)
chip2_annot %>%
  left_join(annotations_edb2, by = c("geneId" = "ENTREZID")) %>%
  write.table(file = "Chip2_peak_annotation.txt",
              sep = "\t", quote = FALSE, row.names = FALSE)
# Write Chip3 annotation to a file
chip3_annot <- data.frame(annotated_peaks[["chip3"]]@anno)
entrez3 <- chip3_annot$geneId
annotations_edb3 <- AnnotationDbi::select(EnsDb.Hsapiens.v75,
                                          keys = entrez3,
                                          columns = c("GENENAME"),
                                          keytype = "ENTREZID")
annotations_edb3$ENTREZID <- as.character(annotations_edb3$ENTREZID)
chip3_annot %>%
  left_join(annotations_edb3, by = c("geneId" = "ENTREZID")) %>%
  write.table(file = "Chip3_peak_annotation.txt",
              sep = "\t", quote = FALSE, row.names = FALSE)
These files will be saved in the working directory. Open them in Excel to study their
contents.
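The annotation files can also be inspected directly in R. The following is a minimal sketch that uses a small made-up table in place of the real output; the file name "toy_peak_annotation.txt", the column values, and the object names are illustrative assumptions, not results from the chapter's data.

```r
# Toy stand-in for an annotated peak table (illustrative values only).
toy_annot <- data.frame(
  seqnames = c("chr1", "chr2", "chr2"),
  start    = c(1000L, 5000L, 9000L),
  end      = c(1500L, 5600L, 9400L),
  geneId   = c("7157", "672", "672"),
  GENENAME = c("TP53", "BRCA1", "BRCA1"),
  stringsAsFactors = FALSE
)

# Write it the same way as in the text: tab-separated, no quotes, no row names.
out_file <- file.path(tempdir(), "toy_peak_annotation.txt")
write.table(toy_annot, file = out_file,
            sep = "\t", quote = FALSE, row.names = FALSE)

# Read the file back; colClasses keeps the gene IDs as text, not numbers.
peaks <- read.delim(out_file,
                    colClasses = c(geneId = "character"),
                    stringsAsFactors = FALSE)

# Inspect the structure and count the peaks assigned to each gene symbol.
str(peaks)
peaks_per_gene <- table(peaks$GENENAME)
print(peaks_per_gene)
```

To apply the same steps to the real output, replace out_file with "Chip3_peak_annotation.txt" and skip the toy table.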
6.3.8 ChIP-Seq Functional Analysis
After peak annotation, the next step in ChIP-Seq data analysis is to identify the biological
implications of the genes associated with the binding sites. For this purpose, we will use
knowledge from Gene Ontology (GO), KEGG, and other pathway databases, as we did in the
RNA-Seq data analysis. The GO enrichment analysis is performed using “enrichGO()”, which
is one of the functions in the “clusterProfiler” package [10]. This function returns the
enriched GO categories that pass the FDR threshold. The following code performs GO
analysis for each sample, writes the results to a file, and creates a dot plot showing a
specified number of the top GO terms. By default, the position of each dot along the x-axis
represents the gene ratio (the number of genes related to the GO term divided by the number
of annotated input genes), the size of the dot represents the number of genes related to the
term, and the color of the dot describes the significance as the adjusted p-value (p-adjusted).
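To make the gene ratio and the adjusted p-value concrete, the following base R sketch works through a toy over-representation test. It assumes the standard hypergeometric test with Benjamini-Hochberg (BH) correction; all counts, term names, and object names are made up for illustration and are not taken from the chapter's data.

```r
# Toy numbers (assumptions for illustration only).
n_universe <- 20000   # background genes that have a GO annotation
n_input    <- 500     # peak-associated genes supplied to the test

go_terms <- data.frame(
  term      = c("term_A", "term_B", "term_C"),
  term_size = c(200, 1000, 50),   # background genes annotated to the term
  overlap   = c(20, 30, 1),       # input genes annotated to the term
  stringsAsFactors = FALSE
)

# Gene ratio: input genes in the term divided by all input genes.
go_terms$gene_ratio <- go_terms$overlap / n_input

# Hypergeometric upper tail: probability of seeing at least `overlap`
# term genes when drawing n_input genes from the background.
go_terms$pvalue <- phyper(go_terms$overlap - 1,
                          go_terms$term_size,
                          n_universe - go_terms$term_size,
                          n_input,
                          lower.tail = FALSE)

# Benjamini-Hochberg adjustment controls the false discovery rate.
go_terms$p_adjust <- p.adjust(go_terms$pvalue, method = "BH")

# Keep the terms that pass a 0.05 threshold on the adjusted p-value.
significant <- go_terms[go_terms$p_adjust < 0.05, ]
print(go_terms)
print(significant$term)
```

In this toy table only term_A is enriched well beyond its expected overlap of 5 genes (200 / 20000 × 500), so it is the only term that passes the threshold.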
ego1 <- enrichGO(gene = entrez1,
                 keyType = "ENTREZID",
                 OrgDb = org.Hs.eg.db,
                 ont = "BP",
                 pAdjustMethod = "BH",
                 qvalueCutoff = 0.05,
                 readable = TRUE)
ego2 <- enrichGO(gene = entrez2,